BMC Medical Informatics and Decision Making — Latest Matching Preprints

1

Early Warning Model for Patient Deterioration: A Machine Learning Approach for Nurse-Led Monitoring

Ahmed, E.; Omer, M.; Endris, N.

2025-06-20 health systems and quality improvement 10.1101/2025.06.20.25329978 medRxiv

Top 0.1%

60.8%

The early recognition of clinical deterioration in hospital inpatients continues to be a major challenge in healthcare. In this work, we propose an intelligible machine learning (iML) based EWS for predicting patient deterioration events and facilitating early nurse interventions. We compare a range of supervised learning models, including gradient boosting and logistic regression, on electronic health record (EHR) data, emphasizing predictive performance and model interpretability. To make the system more clinically trusted and usable, we integrate SHAP (SHapley Additive exPlanations) for feature attributions and inline interpretability in alert interfaces. Our findings show that the proposed system not only has high predictive performance but also has a significantly positive impact on nurse response behavior when alerts are actionable and interpretable. Transparency and user-centered design are further emphasized as keys to encouraging adoption in clinical practice. These results are a step toward the larger goal of incorporating AI into healthcare processes in a way that does not erode safety, trust, or human supervision. Author summary: Detecting patient decline as early as possible is a crucial but difficult problem in hospitals. This work aimed to design an interpretable machine learning (iML) model to predict clinical deterioration based on patient electronic health records. In contrast to classical black-box models, our system is highly accurate, transparent, and usable by frontline healthcare professionals. We tested several machine learning methods and added SHAP (SHapley Additive exPlanations) to provide insight into how the model makes predictions. These explanations appear on the alert interface, which makes it easier for nurses to understand and trust the system. Our findings demonstrate that the alerts enable nurses to react faster and more effectively, resulting in better care coordination. This study emphasizes the need to develop AI tools that complement clinicians' judgment rather than substitute for it, and that are simple to add to existing hospital procedures. Our framework presents a solution for enhancing patient safety via AI without sacrificing human control and confidence.
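
As a rough illustration of the feature attributions mentioned in this abstract: for a linear model with independent features, the SHAP value of feature i has the closed form w_i * (x_i - E[x_i]), expressed in log-odds for logistic regression. The sketch below applies that formula in plain Python. The feature names, weights, and cohort means are hypothetical and are not taken from the preprint, which does not publish its model.

```python
import math

def linear_shap(weights, x, background_mean):
    """SHAP values for a linear model with independent features:
    phi_i = w_i * (x_i - E[x_i])."""
    return [w * (xi - mu) for w, xi, mu in zip(weights, x, background_mean)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical logistic-regression early-warning model (illustrative numbers only).
feature_names = ["heart_rate", "resp_rate", "spo2"]
weights = [0.03, 0.10, -0.08]
intercept = -1.0
background_mean = [80.0, 16.0, 97.0]  # assumed reference-cohort averages

patient = [110.0, 24.0, 91.0]
phi = linear_shap(weights, patient, background_mean)

# The attributions sum to the change in log-odds relative to the average patient.
base_logit = intercept + sum(w * m for w, m in zip(weights, background_mean))
patient_logit = intercept + sum(w * v for w, v in zip(weights, patient))

# List the largest contributions first, as an alert interface might.
for name, contribution in sorted(zip(feature_names, phi), key=lambda t: -abs(t[1])):
    print(f"{name}: {contribution:+.2f} log-odds")
print(f"predicted risk: {sigmoid(patient_logit):.2f}")
```

For tree ensembles such as gradient boosting there is no closed form this simple, so a dedicated SHAP implementation would be needed.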

2

Improving and Interpreting Surgical Case Duration Prediction with Machine Learning Methodology

Lai, J.; Huang, C.-C.; Liu, S.-C.; Huang, J.-Y.; Cho, D.-Y.; Yu, J.

2020-12-08 health systems and quality improvement 10.1101/2020.06.10.20127910 medRxiv

Top 0.1%

59.3%

Accurate prediction of surgical case duration plays a critical role in reducing the cost of operating room (OR) utilization. The most common approaches used by hospitals rely on historic averages for a specific surgeon or a specific procedure type, obtained from electronic medical record (EMR) scheduling systems. However, the low predictive accuracy of EMR-based estimates has negative impacts on patients and hospitals, such as rescheduling and cancellation of surgeries. In this study, we aim to improve prediction of operation case duration with advanced machine learning (ML) algorithms. We obtained a large data set containing 170,748 operation cases (from Jan 2017 to Dec 2019) from a hospital. The data covered a broad variety of details on patients, operations, specialties and surgical teams. A more recent data set with 8,672 cases (from Mar to Apr 2020) was also available for external evaluation. We computed surgeon-specific and procedure-specific historic averages from the EMR and used them as baseline models for comparison. Subsequently, we developed our models using linear regression, random forest and extreme gradient boosting (XGB) algorithms. All models were evaluated with R-squared (R2), mean absolute error (MAE), and percentage overage (case duration > prediction + 10 % & 15 mins), underage (case duration < prediction - 10 % & 15 mins) and within (otherwise). The XGB model was superior to the other models, with higher R2 (85 %) and percentage within (48 %) as well as lower MAE (30.2 mins). The total prediction errors computed for all the models showed that the XGB model had the lowest inaccurate percent (23.7 %). As a whole, this study applied ML techniques in the field of OR scheduling to reduce medical and financial burden for healthcare management. It revealed the importance of operation and surgeon factors in operation case duration prediction. This study also demonstrated the importance of performing an external evaluation to better validate the performance of ML models.
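
The overage/underage/within metric can be sketched as follows. The abstract's notation is terse, so this sketch assumes one reading: a case counts as overage only when the actual duration exceeds the prediction by more than 10 % of the prediction and by more than 15 minutes (underage is the mirror image). The function names and sample durations are hypothetical.

```python
def classify_case(actual_min, predicted_min, pct=0.10, abs_min=15.0):
    """Label one case as 'overage', 'underage', or 'within'.
    Assumption: both the relative and the absolute threshold must be exceeded."""
    diff = actual_min - predicted_min
    pct_threshold = pct * predicted_min
    if diff > pct_threshold and diff > abs_min:
        return "overage"
    if -diff > pct_threshold and -diff > abs_min:
        return "underage"
    return "within"

def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def summarize(actual, predicted):
    """Fraction of cases falling into each category."""
    labels = [classify_case(a, p) for a, p in zip(actual, predicted)]
    return {k: labels.count(k) / len(labels) for k in ("overage", "underage", "within")}

# Illustrative durations in minutes (not data from the study).
actual = [200.0, 100.0, 130.0, 118.0]
predicted = [120.0, 120.0, 120.0, 120.0]
print(summarize(actual, predicted))
print(mean_absolute_error(actual, predicted))
```

Requiring both thresholds keeps short cases from being flagged for small absolute errors and long cases from being flagged for small relative errors.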

3

Artificial-Intelligence-Enabled Early Malnutrition Risk Assessment Tools for Elderly Trauma Patients in Intensive Care Units

Wei, X.; Xao, X.; Hou, J.; Wang, Q.

2026-04-27 nutrition 10.64898/2026.04.26.26351765 medRxiv

Top 0.1%

53.4%

Background & Aims: Accurate assessment of clinical malnutrition using anthropometric and functional indicators could improve the care of elderly trauma patients in intensive care units (ICUs). This study aimed to develop an AI-driven malnutrition assessment toolbox based on a minimal set of clinically feasible indicators. Methods: Multiple machine learning models, including logistic regression, support vector machines, k-nearest neighbors, decision trees, random forests, XGBoost, and neural-network-based ensemble models, were developed using different indicator configurations from a clinically collected patient dataset. Models were trained using baseline and longitudinal measurements to predict malnutrition risk. SHAP analysis was used to interpret the importance of selected indicators. Results: Baseline (Day 1) data alone did not provide a reliable prediction, whereas longitudinal measurements substantially improved performance. Models based on a minimal indicator set, including bilateral mid-upper arm circumference, calf circumference, and key static variables, outperformed models using the full indicator set. Tree-based methods consistently outperformed linear and distance-based models, with the three-time-point XGBoost achieving the best individual performance. Neural-network-based ensemble models further improved predictive stability. The best overall performance was achieved by the ensemble model using the minimal indicator set from Day 1 and Day 3. SHAP analysis confirmed the importance of the selected indicators. Conclusions: This AI-driven toolbox provides an efficient and clinically feasible approach for early malnutrition assessment in elderly trauma patients in the ICU. Its strong performance with a minimal indicator set supports its potential for integration into clinical workflows and future digital twin systems for intelligent nutritional management.

4

A Hybrid Data-Driven Approach For Analyzing And Predicting Inpatient Length Of Stay In Health Centers

Chowdhury, T. N.; Mou, S. A.; Rahman, K. N.

2025-02-03 health systems and quality improvement 10.1101/2025.01.30.25321434 medRxiv

Top 0.1%

53.0%

Patient length of stay (LoS) is a critical metric for evaluating the efficacy of hospital management. The primary objectives are to improve efficiency and reduce costs while enhancing patient outcomes and hospital capacity within the patient journey. By merging data-driven techniques with simulation methodologies, the study proposes a framework for the optimization of patient flow. Using a dataset of 2.3 million de-identified patient records, we analyzed demographics, diagnoses, treatments, services, costs, and charges with machine learning models (Decision Tree, Logistic Regression, Random Forest, AdaBoost, LightGBM) and Python tools (Spark, AWS clusters, dimensionality reduction). Our model predicts patient LoS upon admission using supervised learning algorithms. This hybrid approach enables the identification of key factors influencing LoS, offering a framework for hospitals to streamline patient flow and resource utilization. The patient-flow analysis corroborates the efficacy of the approach, illustrating decreased patient length of stay within a real healthcare environment. The findings underscore the potential of hybrid data-driven models in transforming hospital management practices. This methodology supports flexible decision-making, training, and patient flow enhancement; such a system could have substantial implications for healthcare administration and overall satisfaction with healthcare.

5

A Framework for Inferring and Analyzing Pharmacotherapy Treatment Patterns

Rush, E.; Ozmen, O.; Kim, M.; Ortegon, E. R.; Jones, M.; Park, B. H.; Pizer, S.; Trafton, J.; Brenen, L.; Ward, M.; Nebeker, J. R.

2022-07-29 health systems and quality improvement 10.1101/2022.07.27.22277782 medRxiv

Top 0.1%

40.9%

Objective: To discover pharmacotherapy prescription patterns and their statistical associations with outcomes through a clinical pathway inference framework applied to real-world data. Materials and Methods: We apply machine learning steps in our framework using a 2006 to 2020 cohort of veterans with major depressive disorder (MDD). Outpatient antidepressant pharmacy and emergency department visits, self-harm, and all-cause mortality data were extracted from the Department of Veterans Affairs Corporate Data Warehouse. Results: Our MDD cohort consisted of 252,179 individuals. During the study period there were 98,417 cases of emergency department visits, 1,016 cases of self-harm, and 1,507 deaths from all causes. The top ten prescription patterns accounted for 69.3% of the data for individuals starting antidepressants at the fluoxetine equivalent of 20-39mg. Additionally, we found associations between outcomes and dosage change. Discussion: For 252,179 Veterans who served in Iraq and Afghanistan with subsequent MDD noted in their electronic medical record, we documented and described the major pharmacotherapy prescription patterns implemented by VHA providers. Ten patterns accounted for almost 70% of the data. Associations between antidepressant usage and outcomes in observational data may be confounded. The low numbers of adverse events, especially those associated with all-cause mortality, make our calculations imprecise. Furthermore, our outcomes are also the indications for both disease and treatment. Despite these limitations, we demonstrate the usefulness of our framework in providing operational insight into clinical practice, and our results underscore the need for increased monitoring during critical points of treatment.

6

An application of machine learning to assist medication order review by pharmacists in a health care center

Thibault, M.; Lebel, D.

2019-11-27 health informatics 10.1101/19013029 medRxiv

Top 0.1%

38.7%

The objective of this study was to determine if it is feasible to use machine learning to evaluate how a medication order is contextually appropriate for a patient, in order to assist order review by pharmacists. A neural network was constructed using as input the sequence of word2vec embeddings of the 30 previous orders, as well as the currently active medications, pharmacological classes and ordering department, to predict the next order. The model was trained with data from 2013 to 2017, optimized using 5-fold cross-validation, and tested on orders from 2018. A survey was developed to obtain pharmacist ratings on a sample of 20 orders, which were compared with predictions. The training set included 1 022 272 orders. The test set included 95 310 orders. Baseline training set top 1, top 10 and top 30 accuracy using a dummy classifier were respectively 4.5%, 23.6% and 44.1%. Final test set accuracies were, respectively, 44.4%, 69.9% and 80.4%. Populations in which the model performed the best were obstetrics and gynecology patients and newborn babies (either in or out of neonatal intensive care). Pharmacists agreed poorly on their ratings of sampled orders with a Fleiss kappa of 0.283. The breakdown of metrics by population showed better performance in patients following less variable order patterns, indicating potential usefulness in triaging routine orders to less extensive pharmacist review. We conclude that machine learning has potential for helping pharmacists review medication orders. Future studies should aim at evaluating the clinical benefits of using such a model in practice.
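
The top 1, top 10 and top 30 accuracies reported here count a prediction as correct when the true next order appears among the model's k highest-ranked candidates. A minimal sketch of that metric, with hypothetical drug names and rankings that are not from the study:

```python
def top_k_accuracy(ranked_predictions, targets, k):
    """Fraction of cases whose true label appears among the first k ranked candidates."""
    hits = sum(
        1 for ranked, target in zip(ranked_predictions, targets)
        if target in ranked[:k]
    )
    return hits / len(targets)

# Hypothetical ranked next-order predictions, most likely candidate first.
ranked_predictions = [
    ["acetaminophen", "ondansetron", "morphine"],
    ["heparin", "acetaminophen", "insulin"],
    ["insulin", "heparin", "ondansetron"],
]
targets = ["acetaminophen", "insulin", "morphine"]

for k in (1, 2, 3):
    print(k, top_k_accuracy(ranked_predictions, targets, k))
```

Top-k accuracy can only stay the same or increase as k grows, which is why the reported figures rise from top 1 to top 30.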

7

Prioritising Hospital Complaints: An Innovative Tool Using Large Language Model-Assisted Content Analysis and Machine Learning Algorithms

Sulaiman, M. H.; Muda, N.; Abdul Razak, F.

2025-06-08 health systems and quality improvement 10.1101/2025.06.07.25329193 medRxiv

Top 0.1%

38.2%

Background: In clinical settings, patients often express dissatisfaction through narrative speech or written text. However, most complaints management systems still rely on manual review or rule-based methods that fail to capture the severity or urgency of complaints. This leads to inconsistent triage, delayed resolution and missed opportunities for systemic improvement. A novel model leveraging large language model-assisted content analysis (LACA) and machine learning (ML) can transform subjective narratives into standardized, machine-readable severity scores, facilitating the prioritisation of complaints. Objective: This study aims to (1) determine the precision, recall, f1-score and accuracy of the proposed predictive models used to classify comments into low-alert and high-alert comments, (2) determine the construct validity and internal consistency (Cronbach's α) of the themes found in LACA conducted on hospital web-based review data, (3) determine the predictors of low-alert and high-alert comments and their ability to change the log-odds of the outcome in logistic regression, and (4) measure the robustness of the explanatory model using pseudo-R2. Methodology: LACA was performed using a set of thematic codes to generate an independent variable dataset (x), with a scale of 0: not an issue, 1: a small issue, 2: a moderate issue, 3: a serious issue, and 4: an extremely serious issue. The independent variables (x) and the dependent variable (y, representing the review rating) were then split into training and testing sets to build predictive ML models. Grid search was used to determine the optimal combination of hyperparameters. The performance of the predictive and explanatory models was evaluated. Results: ML classification produced f1-scores of 0.88 - 0.94 and accuracy of 0.92 for the LR model, and f1-scores of 0.87 - 0.94 and accuracy of 0.92 for the ANN model. The behaviour of the predictive models was successfully explained by the explanatory model: six themes were determined with cumulative explained variance (CEV) of 0.74 and average Cronbach's α of 0.86. LR shows significance on 5 themes with pseudo-R2 of 0.55. Conclusion: This study demonstrates that a data pipeline utilizing LACA and ML algorithms shows excellent performance in classifying patient comments in a hospital setting. All effectiveness parameters, including CEV, Cronbach's α, precision, recall, f1-score, and accuracy, indicate strong performance in differentiating high-alert from low-alert comments.
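
Internal consistency of a theme is usually summarised with Cronbach's alpha, defined for k items as alpha = k / (k - 1) * (1 - sum of item variances / variance of the total score). The sketch below computes it with the standard library. The 0 to 4 severity scores are hypothetical, and grouping codes into a theme this way is an assumption about the study's setup.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns.
    Each column holds one item's scores, aligned by respondent (here: by comment)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical theme made of three codes, each scored 0-4 on six comments.
theme_items = [
    [0, 1, 2, 3, 4, 2],
    [0, 1, 3, 3, 4, 1],
    [1, 1, 2, 4, 4, 2],
]
print(round(cronbach_alpha(theme_items), 3))
```

Values near 1 indicate that the codes within a theme move together; perfectly identical items give exactly 1.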

8

Application of Digital Twin and Heuristic Computer Reasoning to Workflow Management: Gastroenterology Outpatient Centers Study

Garbey, M.; Joerger, G.; Furr, S.

2022-03-23 health systems and quality improvement 10.1101/2022.03.22.22272507 medRxiv

Top 0.1%

37.4%

The workflow in a large medical procedural suite is characterized by high variability of input and suboptimal throughput. Today, Electronic Health Record systems do not address the problem of workflow efficiency: there is still high frustration among medical staff, who lack real-time awareness and must react to events based on personal experience rather than anticipating them. In a medical procedural suite, there are many nonlinear coupling mechanisms between individual tasks that could go wrong, and it is therefore difficult for any individual to control the workflow in real time or optimize it in the long run. We propose a systems approach: creating a digital twin of the procedural suite that assimilates Electronic Health Record data and supports rational, data-driven decisions to optimize the workflow on a continuous basis. In this paper, we focus on long-term improvements of gastroenterology outpatient centers as a prototype example and use six months of data acquisition at two different clinical sites to validate the artificial intelligence algorithms.

9

Predicting 30-Day In-Hospital Mortality in Surgical Patients: A Logistic Regression Model Using Comprehensive Perioperative Data

Hofmann, J.; Bouras, A.; Patel, D.; Chetla, N.; Balaji, N.; Boulis, M.

2024-05-20 health informatics 10.1101/2024.05.18.24307573 medRxiv

Top 0.1%

37.1%

Background: Accurate prediction of postoperative outcomes, particularly 30-day in-hospital mortality, is crucial for improving surgical planning, patient counseling, and resource allocation. This study aimed to develop and validate a logistic regression model to predict 30-day in-hospital mortality using comprehensive perioperative data from the INSPIRE dataset. Methods: We conducted a retrospective analysis of the INSPIRE dataset, comprising approximately 130,000 surgical cases from Seoul National University Hospital between 2011 and 2020. The primary objective was to develop a logistic regression model using preoperative and intraoperative variables. Key predictors included demographic information, clinical variables, laboratory values, and the emergency status of the operation. Missing data were addressed through multiple imputation, and feature selection was performed using univariate analysis and clinical judgment. The model was validated using cross-validation and assessed for performance using ROC AUC and precision-recall AUC metrics. Results: The logistic regression model demonstrated high predictive accuracy, with an ROC AUC of 0.978 and a precision-recall AUC of 0.958. Significant predictors of 30-day in-hospital mortality included emergency status of the operation (OR: 1.56), preoperative prothrombin time (PT/INR) (OR: 1.53), potassium levels (OR: 1.49), body mass index (BMI) (OR: 1.37), serum sodium (OR: 1.11), creatinine levels (OR: 1.04), and albumin levels (OR: 0.85). Conclusion: This study successfully developed and validated a logistic regression model to predict 30-day in-hospital mortality using comprehensive perioperative data. The model's high predictive accuracy and reliance on routinely collected clinical and laboratory data enhance its feasibility for integration into existing clinical workflows, providing real-time risk assessments to healthcare providers. Future research should focus on external validation in diverse clinical settings and prospective studies to assess the practical impact of this predictive model.
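
In logistic regression, an odds ratio is the exponential of the fitted coefficient, so the reported ORs map back to log-odds coefficients via the natural logarithm. The sketch below performs that conversion for three of the reported values. The abstract does not state the unit or scaling of each predictor, so the "per unit" interpretation is an assumption, and the helper names are hypothetical.

```python
import math

def coefficient_from_odds_ratio(odds_ratio):
    """Log-odds coefficient implied by an odds ratio: beta = ln(OR)."""
    return math.log(odds_ratio)

def odds_multiplier(odds_ratio, delta_units):
    """Factor by which the odds change when the predictor moves by delta_units."""
    return odds_ratio ** delta_units

# Odds ratios as reported in the abstract; predictor scaling is not stated there.
reported = {"emergency status": 1.56, "PT/INR": 1.53, "albumin": 0.85}

for name, or_value in reported.items():
    beta = coefficient_from_odds_ratio(or_value)
    print(f"{name}: OR={or_value:.2f}, coefficient={beta:+.3f}")
```

An OR below 1, as for albumin, corresponds to a negative coefficient, meaning higher values are associated with lower odds of the outcome.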

10

Data Heterogeneity in Federated Learning with Electronic Health Records: Case Studies of Risk Prediction for Acute Kidney Injury and Sepsis Diseases in Critical Care

Rajendran, S.; Xu, Z.; Pan, W.; Ghosh, A.; Wang, F.

2022-09-01 health informatics 10.1101/2022.08.30.22279382 medRxiv

Top 0.1%

37.1%

With the wider availability of healthcare data such as Electronic Health Records (EHR), more and more data-driven approaches have been proposed to improve the quality of care delivery. Predictive modeling, which aims at building computational models for predicting clinical risk, is a popular research topic in healthcare analytics. However, concerns about the privacy of healthcare data may hinder the development of effective, generalizable predictive models, because such models often require rich, diverse data from multiple clinical institutions. Recently, federated learning (FL) has demonstrated promise in addressing this concern. However, data heterogeneity across local participating sites may affect prediction performance. Exploring such heterogeneity of data sources would aid in building accurate risk prediction models in FL. Because of the high prevalence of acute kidney injury (AKI) and sepsis among patients admitted to intensive care units (ICU), the early prediction of these conditions based on AI is an important topic in critical care medicine. In this study, we take AKI and sepsis onset risk prediction in ICU as two examples to explore the impact of data heterogeneity in the FL framework for risk prediction using EHR data across multiple hospitals. In particular, we built predictive models based on local, pooled, and FL frameworks. The local framework only used data from each site itself. The pooled framework combined data from all sites. In the FL framework, each local site did not have access to other sites' data. A model was trained locally and its parameters were shared with a central aggregator, which updated the federated model's weights and then shared them with each site. We found that models built within an FL framework outperformed their local counterparts. Then, we analyzed variable importance discrepancies across sites and frameworks. Finally, we explored potential sources of the heterogeneity within the EHR data. The different distributions of demographic profiles, medication use, and site information contributed to data heterogeneity. Author Summary: The availability of a large amount of healthcare data such as Electronic Health Records (EHR) and advances in artificial intelligence (AI) techniques provide opportunities to build predictive models for disease risk prediction. Due to the sensitive nature of healthcare data, it is challenging to collect data from different hospitals and train a unified model on the combined data. Federated learning (FL) has recently demonstrated promise in addressing fragmented healthcare data sources while preserving privacy. However, data heterogeneity in the FL framework may influence prediction performance. Exploring the heterogeneity of data sources would contribute to building accurate disease risk prediction models in FL. In this study, we take acute kidney injury (AKI) and sepsis prediction in intensive care units (ICU) as two examples to explore the effects of data heterogeneity in the FL framework for disease risk prediction using EHR data across multiple hospital sites. In particular, multiple predictive models were built based on local, pooled, and FL frameworks. The local framework only used data from each site itself. The pooled framework combined data from all sites. In the FL framework, each local site did not have access to other sites' data. We found that models built within an FL framework outperformed their local counterparts. Then, we analyzed variable importance discrepancies across sites and frameworks. Finally, we explored potential sources of the heterogeneity within EHR data. The different distributions of demographic profiles, medication use, and site information such as the type of ICU at admission contributed to data heterogeneity.
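
The aggregation step described in this abstract, where locally trained parameters are combined centrally and sent back to the sites, can be sketched with sample-size-weighted averaging. This is the common federated averaging rule and is an assumption here: the abstract does not name the aggregation rule the authors used. The site parameters and cohort sizes are hypothetical.

```python
def federated_average(site_params, site_sizes):
    """Weighted average of per-site parameter vectors.
    Each site's weight is its share of the total number of training samples."""
    total = sum(site_sizes)
    n_params = len(site_params[0])
    return [
        sum(params[i] * size for params, size in zip(site_params, site_sizes)) / total
        for i in range(n_params)
    ]

# Hypothetical locally trained coefficients from three hospitals.
site_params = [
    [0.20, -1.00, 0.50],
    [0.40, -0.80, 0.30],
    [0.10, -1.20, 0.70],
]
site_sizes = [1000, 3000, 500]  # assumed local cohort sizes

global_params = federated_average(site_params, site_sizes)
print(global_params)  # this vector would be sent back to every site for the next round
```

Only parameters cross site boundaries in this scheme; patient-level records stay local, which is the privacy property the abstract relies on. Weighting by cohort size also means a large site with an unusual population can dominate the global model, one way data heterogeneity can affect performance.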

11

Building Prediction Models for 30-Day Readmissions Among ICU Patients Using Both Structured and Unstructured Data in Electronic Health Records

Moerschbacher, A.; He, Z.

2021-08-11 health informatics 10.1101/2021.08.10.21261858 medRxiv

Top 0.1%

37.1%

ICU readmissions are associated with poor outcomes for patients and poor performance of hospitals. Patients who are readmitted have an increased risk of in-hospital death; hospitals with a higher readmission rate have reduced profitability, due to increased costs and reduced payments from Medicare and Medicaid programs. Predicting a patient's likelihood of being readmitted to the ICU can help reduce early discharges and the risk of in-hospital deaths, and can help increase profitability. In this study, we built and evaluated multiple machine learning models to predict 30-day readmission of ICU patients in the MIMIC-III database. We used both structured data, including demographics, laboratory tests, and comorbidities, and unstructured discharge summaries as predictors, and evaluated different combinations of features. The best performing model in this study, Logistic Regression, achieved an AUROC of 75.7%. This study shows the potential of leveraging machine learning and deep learning for predicting ICU readmissions.
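
AUROC, the metric reported for the best model, equals the probability that a randomly chosen readmitted patient receives a higher predicted score than a randomly chosen non-readmitted patient, with ties counted as one half. The sketch below computes it directly from that definition. The labels and scores are hypothetical, and the quadratic pairwise loop is for illustration rather than large cohorts.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical readmission labels (1 = readmitted within 30 days) and model scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(auroc(labels, scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.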

12

Improving patient clustering by incorporating structured label relationships in similarity measures

Lambert, J.; Leutenegger, A.-L.; Baudot, A.; Jannot, A.-S.

2023-06-10 health informatics 10.1101/2023.06.06.23291031 medRxiv

Top 0.1%

36.7%

Context: Patient stratification is the cornerstone of numerous health studies, serving to enhance medicine efficacy estimation and facilitate patient matching. To stratify patients, similarity measures between patients can be computed from medical health records databases, such as medico-administrative databases. Importantly, the variables included in medico-administrative databases can be associated with labels, which can be organized in ontologies or other classification systems. However, to the best of our knowledge, the relevance of considering such label classification in the computation of patient similarity measures has been poorly studied. Objective: We propose and evaluate several weighted versions of the Cosine similarity that consider structured label relationships to compute patient similarities from a medico-administrative database. Material and Methods: As a use case, we analyze medicine reimbursements contained in the Echantillon Generaliste des Beneficiaires, a French medico-administrative database. We compute the standard Cosine similarity between patients based on their medicine reimbursements. In addition, we compute a weighted Cosine similarity measure that includes variable frequencies and two weighted Cosine similarity measures that consider label relationships. We construct patient networks from each similarity measure and identify clusters of patients. We evaluate the performance of the different similarity measures with enrichment tests using information on chronic diseases. Results: The similarity measures that include label relationships perform better at identifying similar patients. Indeed, using these weighted measures, we identify distinct patient clusters with a higher number of chronic disease enrichments as compared to the other measures. Importantly, the enrichment tests provide clinically interpretable insights into these patient clusters. Conclusion: Considering label relationships when computing patient similarities improves stratification of patients regarding their health status.
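
A weighted Cosine similarity generalises the standard measure by scaling each variable's contribution with a non-negative weight. The sketch below shows the general form only. The binary reimbursement vectors and the weights are hypothetical, and how the authors derive weights from variable frequencies or label relationships is not specified in the abstract.

```python
import math

def weighted_cosine(x, y, weights):
    """Cosine similarity in which variable i contributes with weight w_i >= 0.
    With all weights equal to 1 this reduces to the standard Cosine similarity."""
    numerator = sum(w * a * b for w, a, b in zip(weights, x, y))
    norm_x = math.sqrt(sum(w * a * a for w, a in zip(weights, x)))
    norm_y = math.sqrt(sum(w * b * b for w, b in zip(weights, y)))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # convention for patients with no weighted reimbursements
    return numerator / (norm_x * norm_y)

# Hypothetical binary reimbursement vectors over four medicine classes.
patient_a = [1, 1, 0, 1]
patient_b = [1, 0, 0, 1]
uniform = [1.0, 1.0, 1.0, 1.0]
rarity = [0.2, 1.5, 1.5, 2.0]  # assumed weights that down-weight common medicines

print(weighted_cosine(patient_a, patient_b, uniform))
print(weighted_cosine(patient_a, patient_b, rarity))
```

A patient network can then be built by linking pairs whose similarity exceeds a chosen threshold, which is the kind of input a clustering step would use.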

13

Machine learning for the prediction of in-hospital mortality in patients with spontaneous intracerebral hemorrhage

Mao, B.; Zhang, R.; Pan, Y.; Zheng, R.; Shen, Y.; Lu, W.; Lu, Y.; Shanhu, X.; Wu, J.; Wang, M.; Wan, S.

2023-08-16 neurology 10.1101/2023.08.15.23294147 medRxiv

Top 0.1%

33.8%

Background: Early and accurate identification of patients with spontaneous intracerebral hemorrhage (sICH) who are at high risk of in-hospital death can help intensive care unit (ICU) physicians make optimal clinical decisions. The aim of this study was to develop a machine learning (ML)-based tool to predict the risk of in-hospital death in patients with sICH in the ICU. Methods: We conducted a retrospective administrative database study using the MIMIC-IV and Zhejiang Hospital databases. The outcome of the study was in-hospital mortality. To develop and validate the final model, we employed LASSO regression to screen and select relevant variables. Five algorithms, namely Logistic Regression (LR), K-Nearest Neighbors (KNN), Adaptive Boosting (AdaBoost), Random Forest (RF), and eXtreme Gradient Boosting (XGBoost), were utilized. The selection of the best model was based on the area under the curve (AUC) in the validation cohort. Furthermore, we employed the SHapley Additive exPlanations (SHAP) methodology to elucidate the contributions of individual features to the model and analyze their impact on the model's outputs. To facilitate accessibility, we also created a visual online calculation page for the model. Results: In the final cohort comprising 1596 patients from MIMIC-IV and Zhejiang Hospital, 367 individuals (23%) experienced in-hospital mortality during the inpatient follow-up period. After extracting 46 variables from the database, LASSO regression identified 14 predictor variables for further analysis. Among the five evaluated models, the XGBoost model demonstrated superior discriminative power in both the internal validation set (AUC = 0.907) and the external validation set (AUC = 0.787). Furthermore, through the SHAP technique, we identified the top 5 predictors in the feature importance rankings: Glasgow Coma Scale (GCS), Sequential Organ Failure Assessment (SOFA), anticoagulant medication, mannitol medication and oxygen saturation. Conclusions: Among the five models, the XGBoost model exhibited superior performance in predicting mortality for patients with sICH in the ICU, indicating its potential significance in the development of early warning systems.

14

A Common Data Model for the standardization of intensive care unit (ICU) medication features in artificial intelligence (AI) applications

Sikora, A.; Keats, K.; Murphy, D. J.; Devlin, J. W.; Smith, S. E.; Murray, B.; Rowe, S.; Coppiano, L.; Kamaleswaran, R.

2023-09-18 intensive care and critical care medicine 10.1101/2023.09.18.23295727 medRxiv

Top 0.1%

33.7%

Objective: Common Data Models provide a standard means of describing data for artificial intelligence (AI) applications, but this process has never been undertaken for medications used in the intensive care unit (ICU). We sought to develop a Common Data Model (CDM) for ICU medications to standardize the medication features needed to support future ICU AI efforts. Materials and Methods: A 9-member, multi-professional team of ICU clinicians and AI experts conducted a 5-round modified Delphi process employing conference calls, web-based communication, and electronic surveys to define the most important medication features for AI efforts. Candidate ICU medication features were generated through group discussion and then independently scored by each team member based on relevance to ICU clinical decision-making and feasibility for collection and coding. A key consideration was to ensure the final ontology both distinguished unique medications and met Findable, Accessible, Interoperable, and Reusable (FAIR) guiding principles. Results: Using a list of 889 ICU medications, the team initially generated 106 different medication features, and 71 were ranked as being core features for the CDM. Through this process, 106 medication features were assigned to two key feature domains: drug product-related (n=43) and clinical practice-related (n=63). Each feature included a standardized definition and suggested response values housed in the electronic data library. This CDM for ICU medications is available online. Discussion: The CDM for ICU medications represents an important first step for the research community focused on exploring how AI can improve patient outcomes and will require ongoing engagement and refinement. Lay Summary: Medication data pose a unique challenge for interpretation by artificial intelligence (AI) because of the alphanumerical combinations (e.g., ibuprofen 200mg every 4 hours) and because of the technical detail associated with drug prescriptions (e.g., ibuprofen 200mg and acetaminophen 325mg are both starting doses and round tablet sizes, so it would be incorrect for the machine to view 325mg as more than 200mg). Because AI has great potential to improve the safety and efficacy of medication use, a common data model for ICU medications (ICURx) is proposed here to overcome these challenges and support AI efforts in medication analysis.

15

Deep Learning-Based Missing Value Imputation for Heart Failure Data from MIMIC-III: A Comparative Study of DAE, SAITS, and MICE+LightGBM

Sharma, S.; Kaur, M.; Gupta, S.

2026-02-11 health systems and quality improvement 10.64898/2026.02.10.26345979 medRxiv

Top 0.1%

33.3%

Background: Electronic Health Records (EHR) are crucial for Clinical Decision Support Systems and for delivering proper care to ICU heart failure patients, but data are often missing because of monitoring device errors, which creates a need for robust imputation methodologies. Objective: To compare and evaluate three methodologies for imputing missing data for heart failure patients from the MIMIC-III database: Denoising Autoencoder (DAE), Self-Attention Imputation for Time Series (SAITS), and Multiple Imputation by Chained Equations (MICE) with LightGBM. Methods: We analyzed 14,090 ICU admissions for patients with heart failure from the MIMIC-III database. Nineteen clinical features were selected on the basis of clinical relevance through a combination of Random Forest analysis, correlation analysis, and Mutual Information. Artificial missing values were introduced into the data set at rates of 20%, 30%, and 50%, and the three imputation methodologies (DAE, SAITS, and MICE+LightGBM) were then evaluated using Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Normalized Root Mean Square Error (NRMSE). Results: Both DAE and SAITS outperformed MICE+LightGBM at every missingness rate. At 20% missingness, DAE had mean MAE = 0.004967, RMSE = 0.005217, and NRMSE = 3.260893, while SAITS had mean MAE = 0.005461, RMSE = 0.005797, and NRMSE = 3.244695; MICE+LightGBM produced higher errors. At 50% missingness, SAITS performed best, followed by DAE, while MICE+LightGBM showed decreased performance. The deep learning methodologies maintained a consistent level of accuracy across the clinical variables measured.
Conclusions: Our analysis indicates that deep learning-based imputation methodologies significantly outperform traditional methodologies for imputing missing values in ICU heart failure data, supporting their implementation in Clinical Decision Support Systems for heart failure patients.
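
The evaluation protocol described in this abstract (hide a known fraction of values, impute them, then score the imputed values against the hidden truth) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic, the mean imputer is only a placeholder for DAE, SAITS, or MICE+LightGBM, the function names are hypothetical, and NRMSE is normalized by the range of the hidden values because the abstract does not say which normalization was used.

```python
import math
import random

def mask_at_random(values, rate, rng):
    """Hide roughly `rate` of the entries; return the masked copy and the hidden indices."""
    hidden = [i for i in range(len(values)) if rng.random() < rate]
    masked = list(values)
    for i in hidden:
        masked[i] = None
    return masked, hidden

def mean_impute(masked):
    """Placeholder imputer: fill every missing entry with the mean of the observed ones."""
    observed = [v for v in masked if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in masked]

def imputation_errors(truth, imputed, hidden):
    """MAE, RMSE and range-normalized RMSE, computed on the hidden entries only."""
    diffs = [imputed[i] - truth[i] for i in hidden]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # Assumption: normalize by the range of the hidden true values.
    spread = max(truth[i] for i in hidden) - min(truth[i] for i in hidden)
    nrmse = rmse / spread if spread > 0 else float("nan")
    return mae, rmse, nrmse

rng = random.Random(0)
truth = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # synthetic stand-in for one clinical feature

# Repeat the evaluation at the three missingness rates named in the abstract.
results = {}
for rate in (0.2, 0.3, 0.5):
    masked, hidden = mask_at_random(truth, rate, rng)
    results[rate] = imputation_errors(truth, mean_impute(masked), hidden)
```

Scoring only the hidden positions matters: including the observed entries, which every imputer reproduces exactly, would shrink all three error figures and blur the differences between methods.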

16

Evidence-based XAI of clinical decision support systems for differential diagnosis: Design, implementation, and evaluation

Miyachi, Y.; Ishii, O.; Torigoe, K.

2024-07-18 health informatics 10.1101/2024.07.18.24310609 medRxiv

Top 0.1%

33.2%

Introduction: We propose an Explainable AI (XAI) model for Clinical Decision Support Systems (CDSSs). It supports physicians' Differential Diagnosis (DDx) with Evidence-based Medicine (EBM) by identifying the instances of case data that contribute to predicted diseases. Each case is linked to its source medical literature, so the model can provide medical professionals with evidence for predicted diseases. Methods: The source of the case data (training data) is medical literature. The prediction model (the main model) uses a Neural Network (NN) with Learning To Rank (LTR); physicians' DDx and machine LTR are remarkably similar. The XAI model (the surrogate model) is a k-Nearest Neighbors Surrogate model (k-NN Surrogate model), which combines example-based explanations, a local surrogate model, and k-Nearest Neighbors (k-NN). Requirements of the XAI for CDSS and features of the XAI model are remarkably adaptable. To improve the surrogate model's performance, it performs "Selecting its data closest to the main model." We evaluated the prediction and XAI performance of the models. Results: With the effect of "Selecting," the surrogate model's prediction and XAI performances are higher than those of the "standalone" surrogate model. Conclusions: The k-NN Surrogate model is a useful XAI model for CDSS. For CDSSs with similar aims and features, the k-NN Surrogate model is helpful and easy to implement. The k-NN Surrogate model is an Evidence-based XAI for CDSSs. Unlike current commercial Large Language Models (LLMs), our CDSS shows evidence for predicted diseases to medical professionals.

17

Enhancing Automated Medical Coding: Evaluating Embedding Models for ICD-10-CM Code Mapping

Klotzman, V.

2024-07-03 health informatics 10.1101/2024.07.02.24309849 medRxiv

Top 0.1%

32.6%

Purpose: The goal of this study is to enhance automated medical coding (AMC) by evaluating the effectiveness of modern embedding models in capturing semantic similarity and improving the retrieval process for ICD-10-CM code mapping. Achieving consistent and accurate medical coding practices is crucial for effective healthcare management. Methods: We compared the performance of embedding models, including text-embedding-3-large, text-embedding-004, voyage-large-2-instruct, and mistralembed, against ClinicalBERT. These models were assessed for their ability to capture semantic similarity between long and short ICD-10-CM descriptions and to improve the retrieval process for mapping diagnosis strings from the eICU database to the correct ICD-10-CM codes. Results: The text-embedding-3-large and text-embedding-004 models outperformed ClinicalBERT in capturing semantic similarity, with text-embedding-3-large achieving the highest accuracy. For ICD-10 code retrieval, the voyage-large-2-instruct model demonstrated the best performance. Using the 15 nearest neighbors provided the best results. Increasing the number beyond this did not improve accuracy due to a lack of meaningful information. Conclusion: Modern embedding models significantly outperform specialized models like ClinicalBERT in AMC tasks. These findings underscore the potential of these models to enhance medical coding practices, in spite of the challenges with ambiguous diagnosis descriptions.
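
The retrieval step described in this abstract, in which a diagnosis string is embedded and the nearest ICD-10-CM description embeddings are returned as candidates, can be illustrated with a small nearest-neighbor sketch. Everything specific here is an assumption made for illustration: the 3-dimensional vectors are invented toy values rather than outputs of any of the named embedding models, cosine similarity is assumed as the similarity measure, and `top_k_codes` is a hypothetical helper name. Only the default of 15 neighbors comes from the abstract.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k_codes(query_vec, code_vecs, k=15):
    """Return up to k codes whose description embeddings are most similar to the query."""
    ranked = sorted(
        code_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [code for code, _ in ranked[:k]]

# Toy vectors standing in for real description embeddings (hypothetical values).
code_vecs = {
    "I50.9": [0.9, 0.1, 0.0],  # heart failure, unspecified
    "I10": [0.2, 0.9, 0.1],    # essential hypertension
    "E11.9": [0.0, 0.2, 0.9],  # type 2 diabetes without complications
}
query = [0.8, 0.2, 0.1]  # pretend embedding of a heart-failure diagnosis string

candidates = top_k_codes(query, code_vecs, k=2)
```

In a pipeline of this kind, `k` trades recall against noise: a larger candidate list is more likely to contain the correct code but leaves more near-duplicates for the downstream step to sort through.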

18

A pipeline for tabular dataset formation from unstructured data provided by ACR Appropriateness Criteria guidelines

Eduardo, A.; Loureiro, R. M.; Tachibana, A.; Netto, P.; Almeida, T. F. d.; Monteiro, L. H. A.; Santos, A. P. d.

2022-04-21 health informatics 10.1101/2022.04.20.22274096 medRxiv

Top 0.1%

32.3%

Currently, data plays a critical role in disparate human activities, from law to technology. Among data-centric technologies, clinical decision support systems (CDSS) stand out as one of the most promising for healthcare. Despite the technological advances facilitating their implementation, the maintenance of the knowledge base for CDSS remains open to improvement. Here, we argue that the Appropriateness Criteria provided by ACR guidelines can be used as an open data source that, combined with appropriate algorithms, can push forward basic research and technological development regarding knowledge bases for CDSS. Therefore, we developed a pipeline capable of forming tabular datasets from ACR guidelines, which are stored on a website as textual PDF files. We also experimentally demonstrate that the proposed pipeline successfully recovers the contents of interest, and we discuss the best composition of its component algorithms. Future research focused on the algorithms' flexibility in the face of PDF template updates could improve our work.

19

Inhospital Mortality, Readmission, and Prolonged Length of Stay Risk Prediction Leveraging Historical Electronic Health Records

Bopche, R.; Tuset, L. G.; Afset, J. E.; Ehrnström, B.; Damas, J. K.; Nytro, O.

2024-04-16 health informatics 10.1101/2024.04.15.24305875 medRxiv

Top 0.1%

32.1%

Objective: The aim of this study was to investigate how well the historical patient records maintained at hospitals predict impending adverse outcomes such as mortality, readmission, and prolonged length of stay (PLOS). Methods: Leveraging a de-identified dataset from a tertiary care university hospital, we developed an eXplainable Artificial Intelligence (XAI) framework combining tree-based and traditional machine learning (ML) models with interpretations and statistical analysis of predictors of mortality, readmission, and PLOS. Results: Our framework achieved an Area Under the Receiver Operating Characteristic (AUROC) of 0.9625 and an Area Under the Precision-Recall Curve (AUPRC) of 0.8575 for 30-day mortality at discharge, and an AUROC of 0.9545 and AUPRC of 0.8419 at admission. For readmission and PLOS risk, the highest AUROCs achieved were 0.8198 and 0.9797, respectively. The tree-based ML models consistently outperformed the traditional ML models in all four prediction tasks. The key predictors were age, derived temporal features, routine laboratory tests, and diagnostic and procedural codes. Conclusion: The study underscores the potential of leveraging medical history for enhanced predictive analytics in hospitals. We present an accurate and intuitive framework for early warning models that can be easily implemented in current and developing digital health platforms to predict adverse outcomes.
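
For readers less familiar with the two metrics reported in this abstract, the sketch below computes them on a tiny invented example. It is an illustration only: the labels and scores are made up, the function names are hypothetical, AUROC is computed as the fraction of positive-negative pairs ranked correctly, and AUPRC is approximated by average precision, which may differ from the estimator the authors used.

```python
def auroc(labels, scores):
    """Probability that a random positive scores higher than a random negative (ties count half).

    Assumes both classes are present.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive case.

    Assumes at least one positive; tied scores are ranked in input order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Invented outcomes (1 = adverse event) and model risk scores for six admissions.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]

roc = auroc(labels, scores)             # 8 of 9 positive-negative pairs are ordered correctly
ap = average_precision(labels, scores)  # (1/1 + 2/2 + 3/4) / 3
```

AUROC is insensitive to how rare the outcome is, whereas average precision falls as positives become scarcer, which is one reason the two numbers are commonly reported together for outcomes such as mortality.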

20

Data quality and Big Data in the health industry: a scoping review protocol

Tomaz Santos, L. C.; Bublitz, F. M.

2024-10-18 health systems and quality improvement 10.1101/2024.10.18.24315741 medRxiv

Top 0.1%

28.7%

Introduction: Big Data is characterized by the large volume of data, the variety of types and formats, the speed with which they are generated, and the veracity and value that can be extracted from the data. However, the results obtained with this technology depend on the quality of the information obtained from the data. Big Data has great potential in healthcare and can be used to advance diagnosis, treatment, and healthcare management. Health data is highly vulnerable due to its sensitive nature, as it contains personal and confidential information. If exposed or compromised, it could lead to privacy violations, inaccuracies, misuse, incorrect diagnoses, or misguided decision-making in patient care. It is important to prioritize confidentiality, adhere to regulatory compliance, and maintain data integrity; for that, it is essential to use efficient methods to obtain quality data that are fit for the intended purpose. Objective: In this context, the scoping review protocol aims to identify and map existing strategies, methods, or models that improve the quality of medical and health data in Big Data environments. This review explores the methods to support the effective use of Big Data in healthcare while addressing the challenges of maintaining data integrity and ensuring safe decision-making. Methods and analysis: This scoping review will be conducted based on the six-step process outlined in the framework proposed by Levac et al. in "Scoping Studies: Advancing the methodology" and will be reported following the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) checklist. The research team will use Data Quality, Big Data, and Health terms to search for primary studies in the Scopus Document Search, IEEE Xplore Digital Library, and ACM Digital Library databases.